# Clothing Dataset (Label noise detection)

This repository contains a dataset of clothing images along with their corresponding labels and metadata.

## Download

The data for this project is available on [Google Drive](https://drive.google.com/file/d/1PcovyBZBnLdOfadfU1acDLDVLdBtL-MI/view?usp=sharing).

## File Structure

The dataset has the following file structure:

```
root/
├── turk_qualification_check_20k_launch/ (This folder contains the images)
│   ├── Dress/
│   ├── Hoodie/
│   ├── Jacket/
│   ├── Knitwear/
│   ├── Shawl/
│   ├── Shirt/
│   ├── Suit/
│   ├── Sweater/
│   ├── T-shirt/
│   ├── Underwear/
│   ├── Vest/
│   └── Windbreaker/
├── turk_qualification_check_20k_launch_updated.json (This file contains the labels you need)
│
│   # Ignore the files below
├── results.csv
├── turk_qualification_check_20k_launch.csv
└── turk_qualification_check_20k_launch.json
```
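
As a rough sketch of how the layout above might be checked after download, the snippet below builds the two paths you need and reports any category folders that are missing. The `root` argument, the helper names, and the idea of validating the layout are assumptions for illustration; only the folder and file names come from the tree above.

```python
from pathlib import Path

# Names taken from the file structure above.
IMAGE_DIR_NAME = "turk_qualification_check_20k_launch"
LABELS_FILE_NAME = "turk_qualification_check_20k_launch_updated.json"
CATEGORIES = [
    "Dress", "Hoodie", "Jacket", "Knitwear", "Shawl", "Shirt",
    "Suit", "Sweater", "T-shirt", "Underwear", "Vest", "Windbreaker",
]


def dataset_paths(root):
    """Return (image_dir, labels_file) for a dataset extracted at `root`.

    `root` is a hypothetical local path; nothing is read from disk here.
    """
    root = Path(root)
    return root / IMAGE_DIR_NAME, root / LABELS_FILE_NAME


def missing_categories(image_dir):
    """List the expected category folders that do not exist in `image_dir`."""
    image_dir = Path(image_dir)
    return [name for name in CATEGORIES if not (image_dir / name).is_dir()]
```

An empty list from `missing_categories` suggests the image folder was extracted completely.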

## Labels

The `turk_qualification_check_20k_launch_updated.json` file contains the labels for each sample.

### Label Noise Detection

This file is similar to the `labels_meta.json` file in the clothing1mpp dataset, with the following differences:

1. It includes 20k samples that were randomly selected from the clothing1mpp training dataset.
2. It contains additional information in the `truk_results` field.

#### `truk_results`

The `truk_results` field holds the individual votes and times collected from Amazon Mechanical Turk for each label (`votes` and `time`), together with their aggregates (`aggre_vote` and `aggre_time`). You can use the aggregated values directly for the label noise detection task, or aggregate the raw `votes` and `time` lists yourself.

```json
{
  "labels": [
    {
      "Labels": "Dress",
      "attributes": {
        "Color": "Yellow",
        "Material": "Cotton",
        "Pattern": "Striped"
      },
      "file_name": "000031.jpg",
      "file_path": "/Dress/Yellow_Cotton_Striped/000031.jpg",
      "id": 858643,
      "truk_results": {
        "aggre_time": 24063,
        "aggre_vote": "Yes",
        "note": [
          "This is the votes to the main label",
          "This is the votes to the main label",
          "This is the votes to the main label"
        ],
        "time": [16950, 331710, 24063],
        "votes": ["Yes", "Yes", "Yes"]
      }
    },
    ...
  ],
  "meta_data": {
    "Sweater": {
      "Color": ["Black", "White", "Grey", ...],
      "Material": ["Cotton", "Wool", "Cashmere", ...],
      "Pattern": ["Cable knit", "Ribbed", "Fair Isle", ...]
    },
    ...
  }
}
```
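
If you prefer to aggregate the raw values yourself, the following is a minimal sketch. It assumes a majority vote over `votes` and the median of `time`; the example record above is consistent with that choice (the median of its three times is `24063`, matching `aggre_time`), but the aggregation actually used to produce the file is not documented here. The function names and `json_path` are hypothetical.

```python
import json
from collections import Counter
from statistics import median


def load_labels(json_path):
    """Return the list stored under the top-level "labels" key."""
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)["labels"]


def aggregate(record):
    """Return (majority vote, median time) for one record.

    Assumption: majority vote and median are illustrative choices,
    not necessarily the method used to fill `aggre_vote`/`aggre_time`.
    On a tie, the vote that appears first in the list wins.
    """
    turk = record["truk_results"]
    vote, _count = Counter(turk["votes"]).most_common(1)[0]
    return vote, median(turk["time"])


def flag_possible_noise(records):
    """Return ids of records whose majority vote is not "Yes"."""
    return [r["id"] for r in records if aggregate(r)[0] != "Yes"]


# The example record from above, reduced to the fields used here.
sample = {
    "Labels": "Dress",
    "id": 858643,
    "truk_results": {
        "aggre_time": 24063,
        "aggre_vote": "Yes",
        "time": [16950, 331710, 24063],
        "votes": ["Yes", "Yes", "Yes"],
    },
}

vote, time_taken = aggregate(sample)
```

With the real file, `flag_possible_noise(load_labels(path))` would give a list of candidate noisy samples, where `path` points to `turk_qualification_check_20k_launch_updated.json`.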